Functional load (FL) quantifies the contribution of phonemic contrasts to distinctions made across the lexicon. Previous research has focused particularly on low FL and its association with sound change. Here, we broaden the scope of enquiry to the evolution of FL across all of its values. We apply phylogenetic methods to examine the diachronic evolution of FL across 90 languages of the Pama-Nyungan (PN) family of Australia. We find a high degree of phylogenetic signal in FL. Though phylogenetic signal has been reported for phonological structures such as phonotactics, its detection in measures of phonological function is novel. We also find a significant negative correlation between the FL of vowel length and of following consonants, a deep-time historical trade-off dynamic which we relate to known allophony in modern PN languages and compensatory sound changes in their past. The finding reveals a historical dynamic, similar to transphonologization, which we characterize as a flow of contrastiveness between subsystems of the phonology. Finding this dynamic in a language family that spans an entire continent and multiple millennia provides one of the most compelling examples yet of Sapir's 'drift' hypothesis of non-accidental parallel development in historically related languages.
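For concreteness, one widely used entropy-based definition of FL (in the tradition of Hockett and of Surendran & Niyogi) measures the relative loss of entropy in the language when a contrast is merged; the abstract does not state the exact formulation used here, so the following should be read as illustrative: $\mathrm{FL}(x,y) = \frac{H(L) - H(L_{x=y})}{H(L)}$, where $H(L)$ is the entropy of the language (e.g., over its lexicon) and $L_{x=y}$ is the same language with the phonemes $x$ and $y$ merged.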
As causal inference becomes more widespread, the importance of having good tools to test for causal effects increases. In this work we focus on the problem of testing for causal effects that manifest in a difference in distribution for treatment and control. We build on work applying kernel methods to causality, considering the previously introduced Counterfactual Mean Embedding framework (\textsc{CfME}). We improve on this by proposing the \emph{Doubly Robust Counterfactual Mean Embedding} (\textsc{DR-CfME}), which has better theoretical properties than its predecessor by leveraging semiparametric theory. This leads us to propose new kernel-based test statistics for distributional effects which are based upon doubly robust estimators of treatment effects. We propose two test statistics, one which is a direct improvement on previous work and one which can be applied even when the support of the treatment arm is a subset of that of the control arm. We demonstrate the validity of our methods on simulated and real-world data, as well as giving an application in off-policy evaluation.
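For intuition, the semiparametric construction underlying \textsc{DR-CfME} mirrors the classical augmented inverse-propensity-weighted (AIPW) estimator of a scalar mean outcome under treatment, $\hat{\mu}_{\mathrm{DR}} = \frac{1}{n}\sum_{i=1}^{n}\big[\hat{m}(X_i) + \frac{T_i\,(Y_i - \hat{m}(X_i))}{\hat{e}(X_i)}\big]$, which remains consistent if either the outcome model $\hat{m}$ or the propensity model $\hat{e}$ is correctly specified. This is the standard scalar analogue, not the paper's estimator itself; \textsc{DR-CfME} lifts the same idea from scalar means to kernel mean embeddings.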
Using 3D CNNs on high-resolution medical volumes is very computationally demanding, especially for large datasets like UK Biobank, which aims to scan 100,000 subjects. Here we demonstrate that using 2D CNNs on a few 2D projections (representing mean and standard deviation across axial, sagittal and coronal slices) of the 3D volumes leads to reasonable test accuracy when predicting age from brain volumes. Using our approach, one training epoch with 20,324 subjects takes 40 - 70 seconds using a single GPU, which is almost 100 times faster than a small 3D CNN. These results are important for researchers who do not have access to expensive GPU hardware for 3D CNNs.
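As a rough illustration of the projection step (the paper's exact preprocessing and CNN architecture may differ), each 3D volume can be reduced to six 2D images, the mean and standard deviation along each of the three anatomical axes, which are then fed to a 2D CNN:

import numpy as np

def project_volume(vol):
    # vol: 3D brain volume of shape (H, W, D).
    # Returns six 2D projections: mean and standard deviation across
    # axial, sagittal and coronal slices (one axis each).
    projections = []
    for axis in range(3):
        projections.append(vol.mean(axis=axis))
        projections.append(vol.std(axis=axis))
    # In practice the three planes have different shapes, so they would
    # be resized or padded to a common size (or given separate CNN
    # branches) before being stacked as input channels.
    return projections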
Imaging markers of cerebral small vessel disease provide valuable information on brain health, but their manual assessment is time-consuming and hampered by substantial intra- and inter-rater variability. Automated rating may benefit biomedical research as well as clinical assessment, but the diagnostic reliability of existing algorithms is unknown. Here, we present the results of the \textit{VAscular Lesions DetectiOn and segmentation} (\textit{Where is VALDO?}) challenge that was run as a satellite event at the international conference on Medical Image Computing and Computer Aided Intervention (MICCAI) 2021. This challenge aimed to promote the development of methods for the automated detection and segmentation of small and sparse imaging markers of cerebral small vessel disease, namely enlarged perivascular spaces (EPVS) (Task 1), cerebral microbleeds (Task 2) and lacunes of presumed vascular origin (Task 3), while leveraging weak and noisy labels. Overall, 12 teams participated in the challenge, proposing solutions for one or more tasks (4 for Task 1 - EPVS, 9 for Task 2 - Microbleeds and 6 for Task 3 - Lacunes). Multi-cohort data were used for both training and evaluation. Results showed a large variability in performance both across teams and across tasks, with promising results notably for Task 1 - EPVS and Task 2 - Microbleeds, but not yet practically useful results for Task 3 - Lacunes. The challenge also highlighted performance inconsistencies across cases that may deter use at the individual level, while still proving useful at the population level.
Data augmentation is an important component in the robustness evaluation of natural language processing (NLP) models and in enhancing the diversity of the data they are trained on. In this paper, we present NL-Augmenter, a new participatory Python-based natural language augmentation framework which supports the creation of both transformations (modifications to the data) and filters (data splits according to specific features). We describe the framework and an initial set of 117 transformations and 23 filters for a variety of natural language tasks. We demonstrate the efficacy of NL-Augmenter by using several of its transformations to analyze the robustness of popular natural language models. The infrastructure, datacards and robustness analysis results are publicly available on the NL-Augmenter repository (\url{https://github.com/gem-benchmark/nl-augmenter}).
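To make the two operation types concrete, here is a hypothetical sketch in the spirit of the framework; class and method names are illustrative, not NL-Augmenter's actual interface:

import random

class TypoTransformation:
    # A transformation: modifies the data, here by swapping characters
    # for keyboard neighbours to simulate typos.
    def __init__(self, prob=0.05, seed=0):
        self.prob = prob
        self.rng = random.Random(seed)
        self.neighbours = {"a": "qs", "e": "wr", "o": "ip"}  # tiny example map
    def generate(self, sentence):
        chars = [
            self.rng.choice(self.neighbours[c])
            if c in self.neighbours and self.rng.random() < self.prob else c
            for c in sentence
        ]
        return ["".join(chars)]

class LengthFilter:
    # A filter: splits the data according to a specific feature,
    # here sentence length in tokens.
    def __init__(self, max_tokens=20):
        self.max_tokens = max_tokens
    def keep(self, sentence):
        return len(sentence.split()) <= self.max_tokens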
Causal inference grows increasingly complex as the number of confounders increases. Given treatments $X$, confounders $Z$ and outcomes $Y$, we develop a non-parametric method to test the \textit{do-null} hypothesis $H_0:\; p(y \mid \text{\it do}(X=x)) = p(y)$ against the general alternative. Building on the Hilbert Schmidt Independence Criterion (HSIC) for marginal independence testing, we propose backdoor-HSIC (bd-HSIC) and demonstrate that it is calibrated and has power under a large number of confounders, for both binary and continuous treatments. Additionally, we establish convergence properties of the estimators of covariance operators used in bd-HSIC. We investigate the advantages and disadvantages of bd-HSIC against parametric tests, as well as the importance of do-null testing in contrast to marginal independence testing or conditional independence testing. A complete implementation can be found at \hyperlink{https://github.com/mrhuff/kgformula}{\texttt{https://github.com/mrhuff/kgformula}}.
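For reference, a minimal implementation of the (unconditional) HSIC statistic on which bd-HSIC builds; the backdoor adjustment and the calibration procedure from the paper are omitted here:

import numpy as np

def rbf_gram(x, lengthscale=1.0):
    # x: (n, d) array; returns the RBF kernel Gram matrix.
    sq_dists = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2 * lengthscale ** 2))

def hsic(x, y, lengthscale=1.0):
    # Biased empirical HSIC estimator: trace(K H L H) / (n - 1)^2.
    n = x.shape[0]
    K = rbf_gram(x, lengthscale)
    L = rbf_gram(y, lengthscale)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2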
The treatment of cloud structure in numerical weather and climate models is often greatly simplified to make the models computationally affordable. Here we propose to correct the European Centre for Medium-Range Weather Forecasts 1D radiation scheme ecRad for 3D cloud effects using computationally cheap neural networks. The 3D cloud effects are learned as the difference between ecRad's fast 1D Tripleclouds solver, which neglects them, and its 3D SPARTACUS (SPeedy Algorithm for Radiative TrAnsfer through CloUd Sides) solver, which includes them but is about five times more computationally expensive. With typical errors between 20% and 30% of the 3D signal, the neural networks improve the accuracy at a runtime increase of about 1%. Thus, rather than emulating SPARTACUS entirely, we keep Tripleclouds unchanged for the cloud-free part of the atmosphere and 3D-correct it elsewhere. Focusing on the relatively small 3D correction instead of the entire signal allows a significant improvement of the predictions, if we assume a similar signal-to-noise ratio for both.
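A minimal sketch of this residual-learning setup: a small network is trained to predict only the 3D correction (the SPARTACUS-minus-Tripleclouds difference) and its output is added to the cheap 1D fluxes. All variable names and data below are stand-ins, not ecRad's actual interface:

import numpy as np
from sklearn.neural_network import MLPRegressor

# Stand-in training data: per-profile features plus outputs of the
# cheap 1D solver and the expensive 3D solver.
X = np.random.rand(10000, 16)                     # atmosphere/cloud features
flux_1d = np.random.rand(10000)                   # Tripleclouds (1D) fluxes
flux_3d = flux_1d + 0.1 * np.random.randn(10000)  # SPARTACUS (3D) fluxes

# Learn only the (relatively small) 3D correction, not the full signal.
model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=300)
model.fit(X, flux_3d - flux_1d)

# At inference: run the cheap 1D solver, then add the learned correction.
corrected_flux = flux_1d + model.predict(X)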
We demonstrate a proof-of-concept of a large language model conducting corporate lobbying related activities. We use an autoregressive large language model (OpenAI's text-davinci-003) to determine if proposed U.S. Congressional bills are relevant to specific public companies and provide explanations and confidence levels. For the bills the model deems as relevant, the model drafts a letter to the sponsor of the bill in an attempt to persuade the congressperson to make changes to the proposed legislation. We use hundreds of ground-truth labels of the relevance of a bill to a company to benchmark the performance of the model, which outperforms the baseline of predicting the most common outcome of irrelevance. However, we test the ability to determine the relevance of a bill with the previous OpenAI GPT-3 model (text-davinci-002), which was state-of-the-art on many language tasks until text-davinci-003 was released on November 28, 2022. The performance of text-davinci-002 is worse than simply always predicting that a bill is irrelevant to a company. These results suggest that, as large language models continue to improve core natural language understanding capabilities, performance on corporate lobbying related tasks will continue to improve. We then discuss why this could be problematic for societal-AI alignment.
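A sketch of the relevance-classification step, assuming the legacy openai Python SDK (pre-1.0) through which text-davinci-003 was served via the Completions API; the prompt wording is illustrative, not the paper's:

import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

def assess_bill_relevance(bill_summary, company_name, business_description):
    # Ask the model for a relevance judgment, an explanation and a
    # confidence level, mirroring the pipeline described in the abstract.
    prompt = (
        f"Company: {company_name}\n"
        f"Business: {business_description}\n"
        f"Bill summary: {bill_summary}\n"
        "Is this bill relevant to the company? Answer YES or NO, then give "
        "a one-sentence explanation and a confidence level (0-100)."
    )
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=150,
        temperature=0,  # deterministic answers for benchmarking
    )
    return response["choices"][0]["text"].strip()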
In recent years, deep learning has seen increased usage in the domain of histopathological applications. However, while these approaches have shown great potential, in high-risk environments deep learning models need to be able to judge their own uncertainty and to reject inputs when there is a significant chance of misclassification. In this work, we conduct a rigorous evaluation of the most commonly used uncertainty and robustness methods for the classification of Whole-Slide-Images under domain shift, using the H\&E-stained Camelyon17 breast cancer dataset. Although it is known that histopathological data can be subject to strong domain shift and label noise, to our knowledge this is the first work that compares the most common methods for uncertainty estimation under these aspects. In our experiments, we compare Stochastic Variational Inference, Monte-Carlo Dropout, Deep Ensembles, Test-Time Data Augmentation, as well as combinations thereof. We observe that ensembles of methods generally lead to higher accuracies and better calibration, and that Test-Time Data Augmentation can be a promising alternative when choosing an appropriate set of augmentations. Across methods, rejecting the most uncertain tiles leads to a significant increase in classification accuracy on both in-distribution and out-of-distribution data. Furthermore, we conduct experiments comparing these methods under varying conditions of label noise. We observe that the border regions of the Camelyon17 dataset are subject to label noise and evaluate the robustness of the included methods against different noise levels. Lastly, we publish our code framework to facilitate further research on uncertainty estimation on histopathological data.
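As one concrete example of the compared methods, a minimal Monte-Carlo Dropout sketch in PyTorch (the model and tile batch are placeholders): dropout is kept active at test time, predictions are averaged over stochastic forward passes, and their dispersion serves as an uncertainty score:

import torch

def enable_dropout(model):
    # Re-enable only the dropout layers; weights and normalization
    # statistics stay in eval mode.
    for module in model.modules():
        if module.__class__.__name__.startswith("Dropout"):
            module.train()

@torch.no_grad()
def mc_dropout_predict(model, x, n_samples=20):
    model.eval()
    enable_dropout(model)
    # Stack class probabilities from n_samples stochastic forward passes.
    probs = torch.stack(
        [torch.softmax(model(x), dim=-1) for _ in range(n_samples)]
    )
    mean_probs = probs.mean(dim=0)              # predictive distribution
    uncertainty = probs.var(dim=0).sum(dim=-1)  # per-tile dispersion score
    return mean_probs, uncertainty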
In large-scale machine learning, recent works have studied the effects of compressing gradients in stochastic optimization in order to alleviate the communication bottleneck. These works have collectively revealed that stochastic gradient descent (SGD) is robust to structured perturbations such as quantization, sparsification, and delays. Perhaps surprisingly, despite the surge of interest in large-scale, multi-agent reinforcement learning, almost nothing is known about the analogous question: Are common reinforcement learning (RL) algorithms also robust to similar perturbations? In this paper, we investigate this question by studying a variant of the classical temporal difference (TD) learning algorithm with a perturbed update direction, where a general compression operator is used to model the perturbation. Our main technical contribution is to show that compressed TD algorithms, coupled with an error-feedback mechanism used widely in optimization, exhibit the same non-asymptotic theoretical guarantees as their SGD counterparts. We then extend our results significantly to nonlinear stochastic approximation algorithms and multi-agent settings. In particular, we prove that for multi-agent TD learning, one can achieve linear convergence speedups in the number of agents while communicating just $\tilde{O}(1)$ bits per agent at each time step. Our work is the first to provide finite-time results in RL that account for general compression operators and error-feedback in tandem with linear function approximation and Markovian sampling. Our analysis hinges on studying the drift of a novel Lyapunov function that captures the dynamics of a memory variable introduced by error feedback.
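A minimal sketch of one compressed TD(0) update with error feedback under linear function approximation, illustrating the update structure analyzed in the paper (the compression operator, step sizes and features below are stand-ins):

import numpy as np

def top_k(v, k):
    # A generic compression operator: keep the k largest-magnitude entries.
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def compressed_td_step(theta, e, phi_s, phi_s_next, reward,
                       alpha=0.05, gamma=0.95, k=2):
    # Semi-gradient TD(0) direction with linear value estimate phi @ theta.
    delta = reward + gamma * phi_s_next @ theta - phi_s @ theta
    g = delta * phi_s
    m = g + e          # fold in previously discarded mass (error feedback)
    c = top_k(m, k)    # only the compressed direction is communicated
    e_next = m - c     # store what the compressor dropped for the next step
    return theta + alpha * c, e_next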